Skip to content

build: wasm: filter_wasm: Optimize filter wasm plugin for high throughput#10818

Merged
edsiper merged 2 commits intomasterfrom
cosmo0920-optimize-filter_wasm-plugin-for-high-throughput
Sep 2, 2025
Merged

build: wasm: filter_wasm: Optimize filter wasm plugin for high throughput#10818
edsiper merged 2 commits intomasterfrom
cosmo0920-optimize-filter_wasm-plugin-for-high-throughput

Conversation

@cosmo0920
Copy link
Contributor

@cosmo0920 cosmo0920 commented Sep 2, 2025

Loading once and execute on every cycle on already instantiated wasm program.
With this change, an great effect was also happened.
It makes wasm programs be able to run on Windows.
So, I remove a feature gate for x86_64 and x86 Windows.
I confirmed that filter_wasm plugin is now working on Windows.
But ARM64 Windows seems not to be supported on WAMR library.
It is only enabled for PC platform that includes x86_64 and x86.


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change; please submit the following in a comment:

  • Example configuration file for the change
  • Debug log output from testing the change
  • Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • Run local packaging test showing all targets (including any new ones) build.
  • Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • Documentation required for this feature

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

  • Bug Fixes

    • Improved stability of the WASM filter by reusing a persistent instance and guarding when unavailable.
    • Clearer error messages and handling when the WASM path is inaccessible.
  • Refactor

    • Streamlined the WASM filter lifecycle to initialize once and clean up on exit, reducing per-call overhead.
  • Chores

    • Windows builds now auto-detect CPU architecture and enable/disable WASM features accordingly (x64/x86 enabled, ARM64 disabled), with the detected architecture logged.

cosmo0920 and others added 2 commits September 2, 2025 19:00
Instantiate Wasm programs only once and execute with the instantiated instance.

Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
Signed-off-by: Hiroshi Hatake <cosmo0920.oucc@gmail.com>
@coderabbitai
Copy link

coderabbitai bot commented Sep 2, 2025

Walkthrough

Adds Windows architecture detection in CMake and gates WASM-related flags based on detected arch. Refactors the WASM filter to instantiate a single persistent runtime per filter instance, reuse it for processing, add guards/logging, adjust error paths to avoid destroying the persistent instance, and update exit to clean up the single instance.

Changes

Cohort / File(s) Summary
Windows CMake arch gating
cmake/windows-setup.cmake
Detects FLB_ARCH from CMAKE_SYSTEM_PROCESSOR and pointer size; defaults to x64 if unknown. Sets FLB_WASM and FLB_FILTER_WASM based on arch: arm64=No, x64=Yes, x86=Yes. Applies these in Windows defaults instead of unconditional enabling.
WASM filter persistent instance
plugins/filter_wasm/filter_wasm.c, plugins/filter_wasm/filter_wasm.h
Adds persistent struct flb_wasm *wasm to filter context. Initializes and stores a single instance in cb_wasm_init; reuses in cb_wasm_filter. Adds guard returning FLB_FILTER_NOTOUCH if missing. Introduces error path that avoids destroying the persistent instance. Updates cb_wasm_exit to destroy only the stored instance. Adds logging for path/access and init.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant App as Fluent Bit
  participant Filter as WASM Filter
  participant RT as WASM Runtime

  rect rgba(200,220,255,0.25)
  note over App,Filter: Initialization
  App->>Filter: cb_wasm_init(config)
  Filter->>RT: configure + instantiate(module)
  RT-->>Filter: instance handle
  Filter->>Filter: ctx.wasm = instance
  end

  rect rgba(220,255,220,0.25)
  note over App,Filter: Pre-run (access check)
  App->>Filter: cb_wasm_pre_run()
  Filter-->>App: log if wasm_path inaccessible
  end

  rect rgba(255,245,200,0.35)
  note over App,Filter: Filtering records
  App->>Filter: cb_wasm_filter(chunk)
  alt ctx.wasm present
    Filter->>RT: process(chunk) using ctx.wasm
    RT-->>Filter: result
    Filter-->>App: modified or notouch
  else missing instance
    Filter-->>App: FLB_FILTER_NOTOUCH
  end
  end

  rect rgba(255,220,220,0.35)
  note over Filter: Error handling (changed)
  Filter->>Filter: on encoding/format failure
  note right of Filter: Jump to error path\nwithout destroying ctx.wasm
  end

  rect rgba(230,230,255,0.35)
  note over App,Filter: Shutdown
  App->>Filter: cb_wasm_exit()
  Filter->>RT: destroy ctx.wasm + config
  Filter-->>App: cleanup done
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

I tuned my CMake ears to arches’ tread,
x64 hops forward, arm64 to bed.
One WASM burrow, warm and bright,
reuse the warren, night to night.
No more fleeting instants—stay!
Clean on exit, logs to say.
Thump-thump—ship day! 🐇✨

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch cosmo0920-optimize-filter_wasm-plugin-for-high-throughput

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore or @coderabbit ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
plugins/filter_wasm/filter_wasm.c (2)

197-205: Potential bug: using strlen() on msgpack payload.

ret_val is binary msgpack; strlen() will truncate at NUL and corrupt output.

Prefer an API that returns both pointer and length:

-                ret = flb_log_event_encoder_set_body_from_raw_msgpack(
-                        &log_encoder,
-                        ret_val,
-                        strlen(ret_val));
+                size_t ret_val_size = 0;
+                /* New/alternate API that provides size */
+                ret_val = flb_wasm_call_function_format_msgpack_ex(
+                           wasm, ctx->wasm_function_name,
+                           tag, tag_len, log_event.timestamp,
+                           buf, buf_size, &ret_val_size);
+                ret = flb_log_event_encoder_set_body_from_raw_msgpack(
+                        &log_encoder, ret_val, ret_val_size);

If an _ex variant doesn’t exist, expose return size via an out parameter or a getter.


428-431: Typo in user-facing config help ("Sepecify").

Fixing user-visible docs.

-     "Sepecify the ingesting event format for wasm program"
+     "Specify the ingesting event format for the WASM program"
🧹 Nitpick comments (3)
cmake/windows-setup.cmake (2)

25-25: Fix typo in status message ("setttings").

Minor but user-visible in build logs.

Apply:

-  message(STATUS "Overriding setttings with windows-setup.cmake")
+  message(STATUS "Overriding settings with windows-setup.cmake")

8-15: Harden arch detection (cover ARM64EC) and improve fallback log.

ARM64EC shows up on some VS toolchains; include it explicitly. Also log the raw processor value on fallback.

-if(CMAKE_SYSTEM_PROCESSOR MATCHES "^(ARM64|AARCH64|arm64)$")
+if(CMAKE_SYSTEM_PROCESSOR MATCHES "^(ARM64|AARCH64|arm64|ARM64EC|arm64ec)$")
   set(FLB_ARCH "arm64")
 elseif(CMAKE_SIZEOF_VOID_P EQUAL 8)
   set(FLB_ARCH "x64")
 elseif(CMAKE_SIZEOF_VOID_P EQUAL 4)
   set(FLB_ARCH "x86")
 endif()

-if(NOT FLB_ARCH)
-  message(WARNING "Unknown architecture; falling back to x64 heuristics")
+if(NOT FLB_ARCH)
+  message(WARNING "Unknown architecture '${CMAKE_SYSTEM_PROCESSOR}'; falling back to x64 heuristics")
   set(FLB_ARCH "x64")
 endif()

Also applies to: 17-23

plugins/filter_wasm/filter_wasm.c (1)

144-147: Make the skip log message format-agnostic.

Current text mentions JSON even for msgpack path.

-            flb_plg_debug(ctx->ins, "encode as JSON from msgpack is broken. Skip.");
+            flb_plg_debug(ctx->ins, "WASM function returned NULL; skipping record.");
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between c74ef23 and 9b85817.

📒 Files selected for processing (2)
  • cmake/windows-setup.cmake (3 hunks)
  • plugins/filter_wasm/filter_wasm.c (8 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-08-31T12:46:11.920Z
Learnt from: ThomasDevoogdt
PR: fluent/fluent-bit#9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.920Z
Learning: In fluent-bit CMakeLists.txt, the system library preference flags are defined as FLB_PREFER_SYSTEM_LIB_ZSTD and FLB_PREFER_SYSTEM_LIB_KAFKA with the FLB_ prefix.

Applied to files:

  • cmake/windows-setup.cmake
📚 Learning: 2025-08-31T12:46:11.920Z
Learnt from: ThomasDevoogdt
PR: fluent/fluent-bit#9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.920Z
Learning: In fluent-bit, the correct CMake flag for using system librdkafka is `FLB_PREFER_SYSTEM_LIB_KAFKA=ON`.

Applied to files:

  • cmake/windows-setup.cmake
📚 Learning: 2025-08-31T12:46:11.920Z
Learnt from: ThomasDevoogdt
PR: fluent/fluent-bit#9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.920Z
Learning: In fluent-bit, the correct CMake flag for using system zstd is `PREFER_SYSTEM_LIB_ZSTD=ON`, not `FLB_PREFER_SYSTEM_LIB_ZSTD=ON`.

Applied to files:

  • cmake/windows-setup.cmake
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (29)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
  • GitHub Check: PR - fuzzing test
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-centos-7
🔇 Additional comments (6)
cmake/windows-setup.cmake (1)

38-44: WASM gating by Windows arch looks correct.

arm64 disabled; x64/x86 enabled; matches PR intent.

Also applies to: 134-140

plugins/filter_wasm/filter_wasm.c (5)

58-76: Good: persistent instance retrieval + early NOTOUCH guard.

Prevents crashes when init failed; minimal overhead.


120-122: OK to avoid destroying persistent WASM on per-record errors; add defensive buf cleanup.

On MSGPACK error, if flb_wasm_format_msgpack_mode ever allocates before failing, buf could leak. Also set buf=NULL after frees to prevent accidental double-free across iterations.

-            else {
+            else {
                 flb_plg_error(ctx->ins, "encode as JSON from msgpack is failed");
-                /* Go to on_error without destroying the persistent wasm instance */
-                goto on_error_without_wasm_destroy;
+                /* Cleanup then go to on_error without destroying the persistent wasm instance */
+                if (buf != NULL) { flb_free(buf); buf = NULL; }
+                goto on_error_without_wasm_destroy;
             }
@@
-            if (ret < 0) {
+            if (ret < 0) {
                 flb_plg_error(ctx->ins, "format msgpack is failed");
-                /* Go to on_error without destroying the persistent wasm instance */
-                goto on_error_without_wasm_destroy;
+                if (buf != NULL) { flb_free(buf); buf = NULL; }
+                goto on_error_without_wasm_destroy;
             }

Additionally (outside these ranges), after successful frees add nulling to avoid cross-iteration reuse:

-                flb_free(buf);
+                flb_free(buf);
+                buf = NULL;
@@
-            flb_free(buf);
+            flb_free(buf);
+            buf = NULL;

Also applies to: 131-132, 240-245


381-391: Confirm thread-safety of shared flb_wasm instance.

If filter callbacks can run on multiple workers, WAMR/your flb_wasm wrapper must be re-entrant. If not, add a per-ctx lock or gate the plugin to single worker.

I can follow up with a patch adding a mutex around flb_wasm_call_function_* if flb_wasm isn’t thread-safe. Want that?


338-341: Good: explicit NULL init for ctx->wasm.

Prevents accidental deref before instantiation.


410-422: Cleanup path looks correct.

Destroys instance and config once; safe on NULLs.

@cosmo0920 cosmo0920 added this to the Fluent Bit v4.1 milestone Sep 2, 2025
@edsiper edsiper merged commit 35885ce into master Sep 2, 2025
56 checks passed
@edsiper edsiper deleted the cosmo0920-optimize-filter_wasm-plugin-for-high-throughput branch September 2, 2025 12:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants